Abstract. Social network based approaches to person recommendation are compared to
interest based approaches with the help of an empirical study on a large German social
networking platform. We assess and compare the performance of different basic variants
of the two approaches, first by precision / recall based performance with respect to
reproducing known friendship relations and second by an empirical questionnaire based
study. As expected, the results show that interest based person recommenders are able to
produce more novel recommendations while performing less well with respect to friendship
reproduction. With respect to the users' assessment of recommendation quality, all
approaches perform comparably well, while combined social-interest-based variants are
slightly ahead in performance. The overall results qualify those combined approaches as
a good compromise.



1 Introduction 

The term Social Recommender Systems can be understood in a variety of ways. The
first interpretation of the term substitutes the actors of some sub-network of a social
network for the set of users with similar rating behavior as the neighborhood for making
collaborative recommendations (see e.g. [20], [4] or Q~3).

This approach has been shown to possess certain advantages over traditional
collaborative filtering [15] and has been shown to perform as well or better, at least
in taste related domains [13], [4]. The advantages encompass better performance in
certain situations in view of the portfolio effect (a user is recommended items which
he already knows or which are too similar to those he already knows (see [2])) or cold
start effects (with respect to ratings and trust [12]). New influences (structurally or
radically new recommendations) can enter the information space of a user through
sub-networks of her social network (horizon broadening effect) as easily as or even
more easily than through groups of similar rating but otherwise unrelated users. In
addition, the external intelligence injected into the system by making explicit and
using direct or indirect social relations, as in social networking platforms, may
ensure a somewhat higher probability of relevance of those radically new and
unexpected, horizon broadening recommendations for the user. Reasons for this can be
normative effects in groups [6], [19] ("I should know and like what my peer group
likes") or trust effects and easier explanations for recommendations via the social
network ("I trust and know how to value this album recommendation based on Mark's,
Jenny's and Yiming's musical information space because I know them and their relation
to me and their function, role, position etc. in the network"). There may, however, be
some cases where social recommenders in this first interpretation are less useful than
recommendations by a "network" of anonymous but similar rating users (e.g. in the case
of recommendations of scientific papers), where implicit "topical" relations to these
users are exploited.

A second interpretation of the term Social Recommender System may encompass
recommending items not to single users but to whole groups of users. In this
interpretation the target of the recommendation (the group) is a socially defined
concept [3].

A third interpretation of the term makes persons or groups of persons the
recommended entities, either using social filtering (as discussed above in the first
interpretation of the term), conventional collaborative filtering, content-based
filtering (using any accessible electronic representation of the recommendable persons
or groups as a basis for similarity computations) etc. One example of this
interpretation is team recommendation systems (see e.g. [5]), where teams are
recommended (e.g. to HR administrators), especially in situations where the number of
possible team configurations is very high (such as Open Innovation scenarios).

This contribution deals with a flavor of the third interpretation: recommending, in a
social networking platform, potentially interesting other users to a user based on
mutual interests. While user recommendations on the basis of the social network are
quite common today (consider e.g. the friend recommendations in Facebook), the problem
of how to assess and incorporate users' interests into the recommendation in a simple
and expressive way is still subject to research. In this article, we investigate how
simple interest based person recommendation approaches perform in contrast to social
network based recommendation approaches.

In Section 2 we review more related work concerning social recommender systems.
Section 3 describes the setting of our study, the data-set we use, and the range of
recommendation methods we investigate. Section 4 then presents and discusses the
results. On the one hand, we compute performance measures on the basis of the
reproduction of known friendships as indicators of the usefulness of the recommenders.
On the other hand, the results of an empirical study among our test users are presented
and discussed. In the conclusion, we summarize the overall results and briefly discuss
and compare the implications for the various approaches to person recommendations.

2 Related Work 

Currently the most common example of people recommendation is people matchmaking in
social networks. Popular application domains therein are dating platforms and expert
finders. The former may also consider preferences on different scales besides
traditional demographic data such as age, gender, etc. The latter consider skills,
competencies and expertise acquired from various sources in order to recommend an
expert who could provide suggestions and help for a specific problem.

In online dating platforms, persons usually fill in data that should describe
themselves appropriately in order to find partners that match them in terms of
demography (e.g. age, gender, height), interests (e.g. para-gliding, watching movies)
and preferences (e.g. rock music, smoking). A few of them allow for entering a so
called "target profile" that represents the description of the person one would like to
receive as recommendation. Fiore et al., for instance, investigated which data within a
profile influences the attractiveness perceived by women and men by correlating the
perceived attractiveness with various elements of a user's profile (photos, free-text
components and fixed-choice components) [11].

Diaz et al. [9] describe the match-making problem from an information retrieval
perspective and propose a novel approach for the combination of user profiles to im- 
prove the relevance of recommendations. There, features are extracted from a user pro- 
file (e. g. free-text descriptions) and used as input for a machine learning algorithm that 
selects the most important predictors for good matches. Good matches were considered 
those matches where bilateral user interaction could be identified or the same features 
applied as in the conditions where a bilateral contact occurred (cp. labeled vs. predicted 
relevance). 

Expert finders are a different domain for person recommendations. There primarily 
competencies are regarded in order to increase the probability to find a solution for 
an occurred problem. In the simplest case, the task of finding experts can be solved 
with simple database queries. However, this does not always entail satisfactory results 
due to the difficulty to formulate appropriate queries and because skills may not be 
the only criterion for searching. McDonald and Ackerman ifTTl for instance tried to 
model current best practices for finding experts in a large company and mapped these 
heuristics to a corresponding system. 

In another work McDonald augmented this system with two different social net- 
works: one based on workplace sociability, which represents how often individuals so- 
cialize with each other, the other based on shared workplace context, which represents 
logical work groups and work context over organizational boundaries [16]. His work
emphasized that it is challenging to mix skills and social networks in recommendations 
because users perceive a trade-off between the two: more precisely even though the sys- 
tem looks first for experts and ranks them afterwards according to the social network, 
the users think they get only recommended due to the latter aspect. A related system 
described by Ehrlich et al. ifTUll was used (among other functionalities) to recommend 
experts searched by keywords within a specific social distance in a user's social net- 
work. Furthermore, this work addresses also other aspects such as privacy, acceptance 
and usage of such systems. 

Guy et al. [14] describe a slightly different way of recommending persons. Their
approach is based on the collection of data (from blogs, social bookmarking, etc.) in a
company's intranet in order to suggest people that may belong to a user's social
network but were not explicitly added to the social networking platform.

Regarding person recommender systems in enterprise-internal social networking
platforms, Chen et al. [8] developed a person recommender based on a keyword extraction
algorithm that tries to extrapolate user interests from user contributions. This
approach is a very valuable foundation for our purpose, even if the findings for
enterprise-internal platforms cannot be directly applied to communities of interest
such as the Utopia community. Additionally, in contrast to Chen et al., we want to
extrapolate user interests from all user activities performed in the predefined topic
categories provided by the platform and hence investigate whether this kind of
categorization technique is suitable as background data for interest-based person
recommendations. As a last aspect, the recommendations provided by Chen et al.'s system
may have a different kind of utility or goal (see Section 3) because of the business
domain in which the community is situated.

An extensive review on social matching systems is provided by Terveen and Mc- 
Donald ETI that additionally formulate claims and related research questions in this 
field. This work can be used to derive guidelines for the development of systems that 
incorporate social aspects (especially social networks) to find appropriate matches. 

3 Data-Set and Methods 

We had access to the complete database from the German based social networking plat- 
form Utopia.de [1]. The main purpose of the platform is the collaborative promotion, 
discussion and development of ideas and concepts contributing to more environmental 
sustainability. The platform provided the usual set of services and data-elements, like 
private messaging, discussion boards or blogs and personal profiles, which, in contrast 
to platforms targeted at self presentation, are rather sparse and use few pre-determined 
elements. In contrast to social network based friend recommendations (which we will 
refer to in the following as friend of a friend (FoF) recommendations) being widespread 
in Social Networking platforms or content-based recommendations comparing only the 
profiles directly, we are interested in recommending users other users on the basis of 
their interests reflected by their actions and content on the platform. Thus the sparse 
profiles are not a problem. 

Besides user generated contributions, the platform also contains editorial material
which can be commented upon by the users. Furthermore, users can express positive
attitudes toward a contribution by assigning it a "worth living" point. Instead of a
free social tagging system, the platform has eleven content categories
($C_1, \ldots, C_{11}$), e.g. "Health and Diet" or "Construction and Renovation", that
users can attach to any contribution, which can be viewed as a simple form of tagging
with a fixed tag set. Social tagging based person recommender systems (e.g. [22], [18])
in general recommend persons according to similar tagging behavior. In contrast to
classic Collaborative Filtering (CF), which basically uses similarity measures on the
columns of the user-item-rating matrix $R_{u,i}$ for neighborhood creation towards
recommending items, these systems use the user-tag-item matrix $T_{u,t,i}$ to identify
users with similar tagging behavior for recommending these persons (e.g. as a means of
expert finding). Classic CF belongs to a class of recommending approaches that use
explicit ratings of items (for item recommendation) or of persons (for person
recommendation), whereas social tagging based approaches belong to a class of methods
that work implicitly. Implicit methods infer user attitude towards items or similarity
to other users indirectly from their behavior on the platform (e.g. frequency of
accessing certain contributions) or the content in their information spaces (their
contributions on the platform or their profile, which can be compared using techniques
from information retrieval (e.g. tf-idf vectors and cosine similarity)).

For person recommendations, users with similar tagging behavior can be considered to
have "similar interests" and are thus candidates for being recommended. Strictly
speaking, this interpretation is not necessary: the term "users with similar interests"
can be considered synonymous with "users that are similar with respect to their
behavior on the platform and / or their information spaces".

In contrast to having to compare users by comparing matrices as
$sim(u_1, u_2) = sim(T_{u_1,t,i}, T_{u_2,t,i})$, as in social tagging based person
recommenders, a simpler approach is to count all platform activities of a user related
to a certain category $C_i$ ("categorized activities"). Such categorized activities can
be the creation or commenting of a content item (e.g. a blog entry) or the assignment
of a "worth living" point. For each user $u$, these counts $act_i$ are then normalized
with the total number of categorized activities $\sum_j act_j$ to yield the normalized
categorized activities $A_i = act_i / \sum_j act_j$. We can then compare these vectors
as $sim(u_1, u_2) = sim(A_{u_1,i}, A_{u_2,i})$ to yield a similarity measure for users.
For the actual comparison of the vectors, we use standard cosine similarity and Pearson
correlation. A user needs a minimum total number of categorized activities to be
included in the matrix. From the resulting similarity matrix, we recommend users with a
similarity above an adjustable threshold value that are not already "friends".
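The interest based procedure described above can be sketched roughly as follows. This is a minimal illustration and not the implementation used in the study: the function names, the dictionary based data layout, the default threshold of 0.5 and the handling of constant vectors in the Pearson case are all assumptions made for the example.

```python
import math

def normalized_activities(counts):
    """Turn per-category activity counts into shares that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]

def cosine_similarity(a, b):
    """Standard cosine similarity of two equally long vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def pearson_correlation(a, b):
    """Pearson correlation; returns 0.0 for constant vectors (an assumption)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def recommend(user, activity_counts, friends, threshold=0.5,
              min_activities=3, sim=cosine_similarity):
    """Return (other_user, similarity) pairs above the threshold, best first.

    activity_counts: dict mapping user -> list of counts, one per category.
    friends: dict mapping user -> set of users that are already friends.
    """
    # Only users with enough categorized activities enter the comparison.
    eligible = {u: normalized_activities(c)
                for u, c in activity_counts.items()
                if sum(c) >= min_activities}
    if user not in eligible:
        return []
    scored = []
    for other, vector in eligible.items():
        if other == user or other in friends.get(user, set()):
            continue
        score = sim(eligible[user], vector)
        if score > threshold:
            scored.append((other, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Passing `sim=pearson_correlation` switches the comparison to the Pearson variant while leaving the rest of the procedure unchanged.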

Unfortunately, in the absence of a free social tagging mechanism in the platform 
we cannot compare the performance of our approach against the social tagging based 
person recommenders discussed before. 

An interesting question regarding the validation of person recommender approaches is
the performance metric to be used. At this point the different recommender goals should
be distinguished, since two goals can generally be pursued. One possible evaluation
method is to evaluate whether the user accepted the recommendation (for instance by
clicking on the recommended item). The other is to evaluate whether the recommendation
itself is useful, i.e. whether the recommended person does in fact fulfill the user's
expectations (with respect to a predefined goal, e.g. as a friend, as a discussion
partner or as an expert). Obviously, a strict differentiation of goals is not possible,
because (i) the goals are conceptually not completely disjoint and (ii) from a
technical point of view platforms do not provide this differentiation for classifying
users. Thus, friendships in social networks are treated as a sort of "bookmarks" to
find persons with respect to all the above mentioned goals. In the Utopia case, which
can be regarded as a community of interest, we want people to get in contact in order
to find new discussion partners, such that interesting hints and suggestions related to
the topics discussed on the Utopia platform can be better exchanged.

However, as mentioned, the target platform does not provide any differentiation
concerning this aspect. For this reason, and knowing that the goals for person
recommendation in our case overlap, we had to choose the reproduction rate of
friendship ties that already exist in the platform as the measure for the potential
success of the approach. Therefore, and also in order to compare the approach against a
FoF based approach, we exclude members with fewer than 3 friends and fewer than 8
friends of friends within the test group. The minimum total number of categorized
activities was set to 3. The resulting test group encompassed 334 users with 3984
friendship relations. The mean number of friends was 11.93, the mean number of FoF was
270.31 and the mean number of categorized activities was 87.56. ~31% of the test users
had only 3 or 4 friends, and only ~17% of the users had only three or four categorized
activities.



The FoF recommender, which is similar to the friend recommenders used in common social
networking platforms (see Section 2) and which we compare our approach against,
recommends a person $u_1$ to a person $u_2$ in proportion to the number of common
friends $f_{u_1 \wedge u_2}$ relative to the average total number of friends of both
users, $0.5 (f_{u_1} + f_{u_2})$. The FoF similarity or recommendability of $u_1$ and
$u_2$ is thus given by

$sim(u_1, u_2) = 2 f_{u_1 \wedge u_2} / (f_{u_1} + f_{u_2})$.
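As a small illustration of the FoF measure, the following sketch computes the number of common friends relative to the average number of friends of both users. The function name and the representation of the friendship data as a dictionary of sets are assumptions for the example, as is returning 0.0 when neither user has any friends.

```python
def fof_similarity(friends, u1, u2):
    """Common friends of u1 and u2 relative to their average friend count.

    friends: dict mapping user -> set of that user's friends.
    """
    f1 = friends.get(u1, set())
    f2 = friends.get(u2, set())
    if not f1 and not f2:
        return 0.0  # assumption: no friends at all means no FoF evidence
    common = len(f1 & f2)
    # common / (0.5 * (|f1| + |f2|)) is the same as 2 * common / (|f1| + |f2|)
    return 2 * common / (len(f1) + len(f2))
```

For two users with 3 and 4 friends of which 2 are shared, the measure evaluates to 2 * 2 / (3 + 4) = 4/7.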

We used 10-fold cross validation in our experiments: for each of 10 runs of all of the
recommender approaches that we compare, we leave out one tenth of the friendship
relations and use the remaining nine tenths of the friendship relations to compute FoF
recommendations. The data basis for most of the variations of the basic interest based
recommendation approach, which will be discussed in the next section, remains
constant.
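The partitioning of the friendship relations into ten runs could look roughly like the sketch below. It is an illustration under assumptions: the friendship relations are taken to be a list of unique pairs, and the function name, the fixed seed and the interleaved assignment of relations to folds are choices made only for this example.

```python
import random

def ten_fold_splits(edges, folds=10, seed=0):
    """Yield (training_edges, held_out_edges); each edge is held out once.

    edges: iterable of unique friendship relations, e.g. (user_a, user_b) pairs.
    """
    shuffled = list(edges)
    random.Random(seed).shuffle(shuffled)
    for k in range(folds):
        held_out = shuffled[k::folds]          # roughly one tenth
        held_out_set = set(held_out)
        training = [e for e in shuffled if e not in held_out_set]
        yield training, held_out
```

In each run the recommenders would only see `training`, and `held_out` contains the deleted relations that a recommender should reproduce.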

We then compute the n = 10 best recommendations for each user and each ap- 
proach and measure how many of the deleted one tenth friendship relations are "repro- 
duced" by the recommender. If we recommend a total of A persons in one run and have 
398 deleted friends per run, we can determine the true and false positives (TP and FP) 
and the false negatives FN and have A = TP + FP and FN = 398 - TP. We can 
then compute Precision, Recall and F-Measure as usual as measures of the success rate. 
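The bookkeeping for one run can be written down as a short sketch. The function name and its arguments are hypothetical; only the relations A = TP + FP and FN = (number of deleted relations) - TP are taken from the text, and returning 0.0 for empty denominators is an assumption.

```python
def precision_recall_f(true_positives, recommended, deleted):
    """Compute precision, recall and F-measure for one run.

    recommended: total number A of recommended persons (A = TP + FP).
    deleted: number of deleted friendship relations (FN = deleted - TP).
    """
    false_positives = recommended - true_positives
    false_negatives = deleted - true_positives
    precision = (true_positives / (true_positives + false_positives)
                 if recommended else 0.0)
    recall = (true_positives / (true_positives + false_negatives)
              if deleted else 0.0)
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure
```

As a toy example, reproducing 2 of 4 deleted relations with 10 recommendations gives a precision of 0.2 and a recall of 0.5.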

If the random 10-fold partitioning of the friendship relations deletes fewer than 1 or
more than 10 friendship relations for a single user, we do not compute recommendations
for this user, because in these cases we cannot determine the success rate analogously
to the "regular" cases. Thus, of the 334 (users) * 10 (runs) = 3340 cases, we compute
recommendations for only 1921. Since for each case we recommend the n = 10 best
recommendations, we make 19210 recommendations in total. If we recommend a person that
is already a friend in the respective nine tenths relation data set, we drop this
recommendation.

With this procedure we can, of course, never reach precision values of 1, simply
because we delete on average only 2.07 friend relations in each run and almost always
recommend 10 persons. However, these restrictions apply to all recommenders equally.

4 Results and Discussion 

A general strength of the proposed approach is that, in contrast to social network
based approaches like FoF, a user does not need a friend list. A weak point is that
passive users that do not perform many explicit actions will not acquire a meaningful
$A_{u,i}$ vector.

Table 1 shows the basic results of the experiment. What we see from the table is
that the interest based recommender approach is significantly better than random in re- 
producing pre-existing friendships. The FoF approach is even significantly better. This 
can be attributed to the fact that even in a platform that is mainly targeted towards ex- 
change of content in view of a narrower field of interest (a typical community of interest 
Q), friendship relations are perceived mainly as something social and not so much as 
something content or interest related. The formation of social friendship ties will ob- 
viously be strongly influenced by the friend of a friend effect and can thus much more 
easily be reproduced by the FoF recommender. However, our approach does not aim 
Recommender                        # of rec.   TP     Precision   Recall   F-measure
Random                             19210       144    0.008       0.036    0.012
Interest Based Pearson             19048       250    0.013       0.063    0.022
Interest Based Cosine              19210       283    0.015       0.072    0.024
FoF                                19132       1164   0.061       0.294    0.101
Interest Based Pearson plus link   19048       376    0.020       0.095    0.033
Interest Based Cosine plus link    19210       422    0.022       0.107    0.036

Table 1. Results of the experiment. 10-fold cross validation: precision, recall and
f-measure averaged over 10 runs.



at recommending friends in the mere social sense, but rather at recommending persons
that are related via interests in platforms where the exchange of content is the main
goal, as opposed to platforms where self-presentation and socially related
communication are predominant. Of course, the social sphere and the interest based
sphere are closely related.

In an attempt to nevertheless improve our interest based approach, we investigated
whether weighting different sorts of categorized actions differently makes a difference
(e.g. by giving the creation of a long blog entry a higher weight than the assignment
of a "worth living" point to some item). The variations improved the performance only
very slightly (e.g. plus 4% for precision in the Pearson case), which does not allow
for any significant conclusions.

In QO, authors were able to improve a content based item recommender system by 
additionally taking into account social relations between the item's owners.
Accordingly, we also investigated how far our interest based person recommendation
approach may profit from combining it with a social relation based component. We do
this by multiplying the relevant (> 0.5) interest based similarity scores between two
users by a factor of 1.5 if the two users have at least one common friend. Thus we
effectively augment the interest based approach with the FoF approach (and not vice
versa), so that the general advantages of our approach discussed before are
maintained. We called these variations "plus link"; their results are also shown in
Table 1. As expected, the approach profits from this augmentation with an increase in
performance of approximately 50 %. However, it has to be stressed that, as discussed
before, the performance with respect to reproducing existing friendship ties is
certainly not our main goal and not the only quality criterion of an interest based
person recommender in our sense.
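The "plus link" augmentation amounts to a simple re-scoring step, sketched below. The function and parameter names are hypothetical; the relevance threshold of 0.5 and the factor of 1.5 are the values stated in the text.

```python
def plus_link_score(interest_sim, friends_u1, friends_u2,
                    relevance_threshold=0.5, boost=1.5):
    """Boost a relevant interest based score if the users share a friend.

    friends_u1, friends_u2: sets of friends of the two compared users.
    """
    has_common_friend = bool(friends_u1 & friends_u2)
    if interest_sim > relevance_threshold and has_common_friend:
        return interest_sim * boost
    return interest_sim
```

Because only scores that are already relevant are boosted, the social component re-ranks interest based candidates but cannot introduce candidates on social grounds alone.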



Q-nr   Question                                                               Scale
1      Do you know this user already?                                         yes / no
2      Are you interested in getting to know this user?                       5 point Likert
3      Space for comments on the recommendations                              Free text field
4      Are you generally interested in getting to know new users on Utopia?   5 point Likert
5      Would you like to be recommended users on Utopia?                      5 point Likert
6      Space for general comments                                             Free text field

Table 2. Online survey: questions



4.1 Empirical study 

In order to address this issue, we conducted an online empirical study among our test
users, dividing them into three groups and providing them with 5 person
recommendations using one sort of recommender in each group (interest based cosine,
interest based cosine plus link and FoF). The users were asked to evaluate the
recommendations according to several criteria (see Table 2). General statistics with
respect to this survey are shown in Table 3.



Recommender        Completed questionnaires   Percent completed
Cosine             28                         28.3 %
Cosine plus link   35                         35.4 %
FoF                36                         36.4 %

Table 3. Online survey: general statistics



The recommendations were provided in the form of picture and username of the 
recommended person as specified in the platform. By clicking on either username or 
picture, the profile page of the corresponding person could be inspected in order to 
identify possible interesting characteristics of that person. Based on this knowledge the 
user can decide whether the recommendation proposed is appropriate or not. 



Recommender      rec. person known   rec. person unknown   # of rec.
Cosine           40.7% (57)          59.3% (83)            140
Cos. plus Link   41.1% (72)          58.9% (103)           175
FoF              57.8% (104)         42.2% (76)            180
Overall          47.1% (233)         52.9% (262)           495

Table 4. Results of question 1



Table 4 shows that the FoF variant is more likely to recommend already known persons,
which is socially plausible. The overall high number of recommendations of already
familiar persons can be explained by the fact that, due to the selection scheme of the
334 users (see previous section), very active users were selected, who have a high
probability of knowing each other.



Rec.            Quest. 1     Question 2
                             1       2       3       4       5
Cos.            (unknown)    24.1%   16.9%   34.9%   16.9%   7.2%
                (known)      1.8%    3.5%    54.4%   26.3%   14.0%
                Overall      15.0%   11.4%   42.9%   20.7%   10.0%
Cos. pl. lnk.   (unknown)    5.8%    27.2%   28.2%   27.2%   11.7%
                (known)      22.2%   6.9%    23.6%   26.4%   20.8%
                Overall      12.6%   18.9%   26.3%   26.9%   15.4%
FoF             (unknown)    9.2%    19.7%   40.8%   14.5%   15.8%
                (known)      15.4%   1.9%    33.7%   26.9%   22.1%
                Overall      12.8%   9.4%    36.7%   21.7%   19.4%
Overall         (unknown)    12.6%   21.8%   34.0%   20.2%   11.5%
                (known)      14.2%   3.9%    35.6%   26.6%   19.7%
                Overall      13.3%   13.3%   34.7%   23.2%   15.4%

Table 5. Cross-table question 1 (Familiarity) and question 2 (Interestingness)



Table 5 shows the relation between previous familiarity and the rating of
interestingness. We see that the error of central tendency is present throughout the
results of question 2. It is overall slightly more present for the recommender that
does not make use of the social network (cosine). However, for the recommenders that
make use of the social network (FoF and cosine plus link) this tendency is slightly
more prominent for the unfamiliar recommended persons than for the familiar ones, while
for the recommender that is purely interest based (cosine) this slight effect is
reversed. As an explanation, knowing a person may make it easier to arrive at a more
decisive rating than the less meaningful middle rating. However, it also has to be
taken into account that a main value of a recommender is to recommend new entities
(persons in our case), where the usefulness of these novel recommendations can often
only be properly assessed a posteriori.

The results of the general questions of the questionnaire are shown in Tables 7 and 6.
For question 4, we see that, as expected, the tendency to be interested in getting to
know new people on a social networking platform is quite high. There are no significant
differences among the three test groups. With respect to question 5, we see that the
recommendation service is regarded as positive overall but judged more critically
(23.3 % negative answers (rating 1 or 2) in question 5) compared to the general
predisposition to be interested in getting to know new people (7.1 % negative answers
in question 4). However, the share of positive answers (rating 4 or 5) among the group
that was confronted with the recommendations from the merely interest



Recommender     1       2       3       4       5
Cos.            21.4%   3.6%    35.7%   21.4%   17.9%
Cos. pl. lnk.   20.0%   2.9%    17.1%   45.7%   14.3%
FoF             11.1%   11.1%   16.7%   41.7%   19.4%
Overall         17.2%   6.1%    22.2%   37.4%   17.2%

Table 6. Results of Question 5 (General interest in person recommendation service)



Recommender      1      2      3       4       5
Cosine           0.0%   3.6%   32.1%   35.7%   28.6%
Cos.-plus-Link   2.9%   8.6%   28.6%   31.4%   28.6%
FoF              2.8%   2.8%   27.8%   30.6%   36.1%
Overall          2.0%   5.1%   29.3%   32.3%   31.3%

Table 7. Results of Question 4 (General interest in getting to know new people)



based recommender (cosine) is significantly lower (39.3 %) than for the groups that
were confronted with recommendations that included the social network (60.0 % and
61.1 %). This can be seen as a hint that the social network plays an important role for
people recommendations. However, the effect that the usefulness of novel
recommendations can often only be properly assessed a posteriori needs to be taken into
account here as well, because according to Table 4 the cosine recommender proposes more
novel recommendations than the FoF recommender. The cosine plus link recommender, which
results in roughly the same share of novel recommendations as the cosine recommender,
appears to be a good compromise in view of this phenomenon.

5 Conclusion 

From our study it can be concluded that in social networking platforms, person
recommenders are services that have some potential to deliver added value for a large
number of users. Purely interest based recommenders may produce more novel
recommendations than purely social network based recommenders. With respect to the
survey ratings of the test users, the purely interest based approaches perform slightly
worse than the purely social network based approaches. The disproportionately good
performance of the FoF approach in reproducing known friendships can be attributed to
social effects and does not have to be taken as a definitive quality criterion. Mixed
approaches yield many novel recommendations while (with respect to user rating)
performing as well as or even slightly better than the purely social network based
approach. Combined social network based and interest based approaches may thus be a
good compromise and a promising field of future research.